16 research outputs found

    Survey of Machine Learning Techniques for Malware Analysis

    Coping with malware is getting more and more challenging, given its relentless growth in complexity and volume. One of the most common approaches in the literature is to use machine learning techniques to automatically learn the models and patterns behind such complexity, and to develop technologies that keep pace with the development of novel malware. This survey provides an overview of how machine learning has been used so far in the context of malware analysis. We systematize the surveyed papers according to their objectives (i.e., the expected output, what the analysis aims to achieve), the information about malware they specifically use (i.e., the features), and the machine learning techniques they employ (i.e., the algorithm used to process the input and produce the output). We also outline a number of problems concerning the datasets used in the considered works, and finally introduce the novel concept of malware analysis economics, which concerns the study of existing tradeoffs among key metrics, such as analysis accuracy and economic costs.

    Android Malware Family Classification Based on Resource Consumption over Time

    The vast majority of today's mobile malware targets Android devices. This has driven research effort in Android malware analysis in recent years. An important task of malware analysis is the classification of malware samples into known families. Static malware analysis is known to fall short against techniques that change the static characteristics of the malware (e.g., code obfuscation), while dynamic analysis has proven effective against such techniques. To the best of our knowledge, the most notable work on Android malware family classification based purely on dynamic analysis is DroidScribe. Compared with DroidScribe, our approach is easier to reproduce. Our methodology employs only publicly available tools, does not require any modification to the emulated environment or the Android OS, and can collect data from physical devices. The latter is a key factor, since modern mobile malware can detect the emulated environment and hide its malicious behavior. Our approach relies on resource consumption metrics available from the proc file system. Features are extracted through detrended fluctuation analysis and correlation. Finally, an SVM is employed to classify malware into families. We provide an experimental evaluation on malware samples from the Drebin dataset, where we obtain a classification accuracy of 82%, showing that our methodology achieves an accuracy comparable to that of DroidScribe. Furthermore, we make the software we developed publicly available, to ease the reproducibility of our results. Comment: Extended Versio

    Towards a Near-real-time Protocol Tunneling Detector based on Machine Learning Techniques

    In recent years, cybersecurity attacks have increased at an unprecedented pace, becoming ever more sophisticated and costly. Their impact has involved private and public companies as well as critical infrastructures. At the same time, due to the COVID-19 pandemic, the security perimeters of many organizations expanded, increasing the attack surface exploitable by threat actors through malware and phishing attacks. Given these factors, it is of primary importance to monitor the security perimeter and the events occurring in the monitored network, according to a tested security strategy of detection and response. In this paper, we present a protocol tunneling detector prototype which inspects, in near real time, a company's network traffic using machine learning techniques. Tunneling attacks allow malicious actors to maximize the time in which their activity remains undetected. The detector monitors unencrypted network flows and extracts features to detect possible ongoing attacks and anomalies, by combining machine learning and deep learning. The proposed module can be embedded in any network security monitoring platform able to provide network flow information along with its metadata. The detection capabilities of the implemented prototype have been tested on both benign and malicious datasets. Results show 97.1% overall accuracy and an F1-score of 95.6%. Comment: 12 pages, 4 figures, 4 table

    Share a pie? Privacy-preserving knowledge base export through count-min sketches

    Knowledge base (KB) sharing among parties has been proven to be beneficial in several scenarios. However, such sharing can raise considerable privacy concerns, depending on the sensitivity of the information stored in each party’s KB. In this paper, we focus on the problem of exporting (part of) a party's KB to a receiving party. We introduce a novel solution that enables parties to export data in a privacy-preserving fashion, based on a probabilistic data structure, namely the count-min sketch. With this data structure, KBs can be exported in the form of key-value stores and inserted into a set of count-min sketches, where keys can be sensitive and values are counters. Count-min sketches can be tuned to achieve a given key collision probability, which enables a party to deny having certain keys in its own KB, and thus to preserve its privacy. We also introduce a metric, the γ-deniability (novel for count-min sketches), to measure the privacy level obtainable with a count-min sketch. Furthermore, since the value associated with a key can expose the party to linkage attacks, noise can be added to a count-min sketch to ensure a controlled error on retrieved values. Key collisions and noise alter the values contained in the exported KB, and can negatively affect the accuracy of a computation performed on the exported KB. We explore the tradeoff between privacy preservation and computation accuracy through experimental evaluations in two scenarios related to malware detection.
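
To make the data structure mentioned in this abstract concrete, here is a minimal count-min sketch in Python. It is only an illustration of the generic structure, not the authors' implementation: the class name `CountMinSketch`, the SHA-256 based row hashing, and the example keys are all assumptions, and the noise addition and γ-deniability metric from the paper are not modeled.

```python
import hashlib


class CountMinSketch:
    """Generic count-min sketch: depth rows of width counters each."""

    def __init__(self, width, depth):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, key, row):
        # One hash function per row, derived by salting SHA-256 with the
        # row number (an illustrative choice, not taken from the paper).
        digest = hashlib.sha256(f"{row}:{key}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, key, count=1):
        # Increment one counter per row.
        for row in range(self.depth):
            self.table[row][self._index(key, row)] += count

    def estimate(self, key):
        # The minimum over rows never underestimates the true count.
        return min(
            self.table[row][self._index(key, row)]
            for row in range(self.depth)
        )


# Hypothetical keys standing in for sensitive KB entries.
sketch = CountMinSketch(width=64, depth=4)
sketch.add("sample-hash-A", 3)
sketch.add("sample-hash-B", 2)
print(sketch.estimate("sample-hash-A"))
```

A narrower sketch produces more key collisions. In the extreme case of `width=1`, every key maps to the same counter, so a key that was never inserted is indistinguishable from one that was, which is the intuition behind being able to deny holding a given key.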

    An architecture for semi-automatic collaborative malware analysis for CIs

    Critical Infrastructures (CIs) are among the main targets of activists, cyber terrorists and state-sponsored attacks. To protect itself, a CI needs to build and keep updated a domestic knowledge base of cyber threats. It cannot rely completely on external service providers, because information on incidents can be so sensitive as to impact national security. In this paper, we propose an architecture for a malware analysis framework to support CIs in such a challenging task. Given the huge number of new malware samples produced daily, the architecture is designed to automate the analysis to a large extent, leaving to human analysts only a small and manageable part of the whole effort. This non-automatic part of the analysis requires a wide range of expertise, usually contributed by several analysts. The architecture enables analysts to work collaboratively to improve the understanding of samples that demand deeper investigation (intra-CI collaboration). Furthermore, the architecture allows sharing partial and configurable views of the knowledge base with other interested CIs, in order to collectively obtain a more complete view of the cyber threat landscape (inter-CI collaboration).

    Towards the usage of invariant-based app behavioral fingerprinting for the detection of obfuscated versions of known malware

    App fingerprints can be used to verify whether two apps are the same, and are useful tools for malware detection because they make it possible to recognize obfuscated versions of known malware. Fingerprinting an app on the basis of static features is known to fail against obfuscation, which succeeds in hiding the static characteristics that reveal the malicious nature of an app. In this paper we propose a novel way to compute app fingerprints, which is based on behavioral features. The aim is to capture the semantics of the app, so that obfuscation becomes ineffective. The technique we introduce exploits invariants found among pairs of metrics collected during app execution, and produces a fingerprint consisting of the list of the correlation values of these pairs. We present an experimental evaluation carried out on a real Android device, whose results support the methodology we propose and show that it can be a viable research direction to investigate further.
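
The idea of a fingerprint built from correlation values of metric pairs can be sketched as follows. This is a simplified illustration under stated assumptions, not the method from the paper: it uses plain Pearson correlation over every pair of metrics, and the metric names (`cpu`, `mem`, `net`), the function names, and the choice to return 0.0 for a constant series are all hypothetical.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Assumption: treat a constant series as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def fingerprint(metrics):
    """List of correlation values, one per pair of metric names.

    Pairs are taken in sorted name order so that two fingerprints
    computed from the same set of metrics are directly comparable.
    """
    names = sorted(metrics)
    return [pearson(metrics[a], metrics[b]) for a, b in combinations(names, 2)]


# Hypothetical samples collected while an app runs.
samples = {
    "cpu": [1, 2, 3, 4],
    "mem": [2, 4, 6, 8],
    "net": [4, 3, 2, 1],
}
print(fingerprint(samples))
```

Two runs could then be compared by measuring the distance between their fingerprint vectors; how that comparison is done in the paper is not stated in the abstract.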

    Ultrasound evaluation of the uterus in the uncomplicated postpartum period: a systematic review

    The aim of this systematic review and meta-analysis was to define the means and the upper limits of normal for endometrial thickness and uterine measurements in uncomplicated pregnancies at different postpartum periods

    The goods, the bads and the uglies: Supporting decisions in malware detection through visual analytics

    Malware associated with Web downloads is responsible for many attacks that try to execute malicious code on a remote machine. Web browsers are protected by anti-malware utilities that try to distinguish between good downloads and bad downloads, blocking the bad ones and alerting the user. To cope with the uncertainty of such a process, the final decision is very often made using suitable thresholds, giving rise to a three-category classification: good downloads, bad downloads, and “in the middle” downloads (i.e., the uglies). In this situation, it is possible to involve the user (e.g., the security manager) in the decision loop, presenting them with the details of the decision process so that they can either gain confidence in the system's decisions or refine what has been done automatically, e.g., promoting an ugly download to a good one. The paper addresses this problem by presenting a visual analytics solution that supports the analysis of the classification system presented in AMICO [24], giving the user a better understanding of the classification decisions and the possibility of changing the classification results. A prototype is available at: http://awareserver.dis.uniroma1.it:11768/malvis/
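
The threshold-based three-category decision described in this abstract can be illustrated with a short Python sketch. It is not AMICO's classifier: the function names, the score range of 0 to 1, and the threshold values 0.3 and 0.7 are assumptions chosen only for the example.

```python
def classify_download(malicious_score, good_below=0.3, bad_above=0.7):
    """Map a maliciousness score in [0, 1] to good, bad, or ugly.

    The thresholds are hypothetical; scores between them fall into the
    uncertain "ugly" category that a human analyst may review.
    """
    if not 0.0 <= malicious_score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if malicious_score < good_below:
        return "good"
    if malicious_score > bad_above:
        return "bad"
    return "ugly"


def apply_analyst_override(automatic_label, analyst_label=None):
    """Let a human decision, when present, replace the automatic one."""
    return analyst_label if analyst_label is not None else automatic_label


label = classify_download(0.5)
print(label)
print(apply_analyst_override(label, analyst_label="good"))
```

The second function mirrors the human-in-the-loop step: an analyst who inspects an ugly download can promote it to good, while downloads without a manual decision keep the automatic label.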